Summary

Time- and state-domain methods are two common approaches for nonparametric prediction.
The former predominantly uses the data in the recent history while the latter mainly relies on
historical information. The question of combining these two pieces of valuable information is an
interesting challenge in statistics. We surmount this problem by dynamically integrating information
from both the time and the state domains. The estimators from both domains are optimally
combined based on a data-driven weighting strategy, which provides a more efficient estimator of
volatility. Asymptotic normality is separately established for the time-domain, the state-domain,
and the integrated estimators. By comparing the efficiency of the estimators, it is demonstrated
that the proposed integrated estimator uniformly dominates the two other estimators. The proposed
dynamic integration approach is also applicable to other estimation problems in time series.
Extensive simulations are conducted to demonstrate that the newly proposed procedure outperforms
some popular ones such as the RiskMetrics and the historical simulation approaches, among
others. Empirical studies convincingly endorse our integration method.
1 Introduction



In forecasting a future event or making an investment decision, two pieces of useful information 
are frequently consulted. Based on the recent history, one uses a form of local average, such as the 
moving average in the time-domain, to forecast a future event. This approach uses the continuity of
a function and completely ignores the information in the remote history, which is related to the
present through stationarity. On the other hand, one can forecast a future event based on state-domain
modeling such as the ARMA, TAR, ARCH models or nonparametric models (see Tong, 1990; Fan & 
Yao, 2003 for details). For example, to forecast the volatility of the yields of a bond with the current 
rate 6.47%, one computes the standard deviation based on the historical information with yields 
around 6.47%. This approach relies on stationarity and depends entirely on historical data,
but it ignores the importance of recent data. The question of how to combine the estimators
from both the time-domain and the state-domain poses an interesting challenge to statisticians. 



Figure 1: Illustration of time- and state-domain estimation. (a) The yields of 3-month treasury bills from
1954 to 2004. The vertical bar indicates localization in time and the horizontal bar represents localization in
the state. (b) Illustration of time-domain smoothing: squared differences are plotted against their time index
and the exponential weights are used to compute the local average. (c) Illustration of state-domain
smoothing: squared differences are plotted against the level of interest rates, restricted to the interval
6.47% ± .25% indicated by the horizontal bar in Figure 1(a). The Epanechnikov kernel is used for computing
the local average.



To elucidate our idea, consider the weekly data on the yields of 3-month treasury bills presented
in Figure 1. Suppose that the current time is January 4, 1991 and the interest rate is 6.47% on that
day, corresponding to the time index t = 1930. One may estimate the volatility based on the
weighted squared differences in the past 52 weeks (1 year), say. This corresponds to time-domain
smoothing, using a small vertical stretch of data in Figure 1(a). Figure 1(b) computes
the squared differences of the past year's data and depicts the associated exponential weights.
The estimated volatility (conditional variance) is indicated by the dashed horizontal bar. Let the
resulting estimator be $\hat\sigma^2_{t,\mathrm{time}}$. On the other hand, in financial activities, we do consult
historical information in making better decisions. The current interest rate is 6.47%. One may examine
the volatility of the yields when the interest rates are around 6.47%, say, 6.47% ± .25%. This
corresponds to using the part of data indicated by the horizontal bar. Figure 1(c) plots the squared
differences $(X_t - X_{t-1})^2$ against $X_{t-1}$ with $X_{t-1}$ restricted to the interval 6.47% ± .25%.
Applying the local kernel weight to the squared differences results in a state-domain estimator
$\hat\sigma^2_{t,\mathrm{state}}$, indicated by the horizontal bar in Figure 1(c). Clearly, as shown in Figure 1(a),
except in the 3-week period right before January 4, 1991 (which can be excluded in the state-domain
fitting), the last period with interest rate around 6.47% ± .25% is the period from May 15, 1988 to
July 22, 1988. Hence, the time- and state-domain estimators use two nearly independent components of
the time series, as they are 136 weeks apart in time. See the horizontal and vertical bars of
Figure 1(a). These two kinds of estimators have been used in the literature for forecasting volatility.
The former is prominently featured in the RiskMetrics of J. P. Morgan, and the latter has been used in
nonparametric regression (see Tong, 1995; Fan & Yao, 2003 and references therein). The question
arises how to integrate them.

An integrated estimator introduces a dynamic weighting scheme $0 \le w_t \le 1$ to combine the
two nearly independent estimators. Define the resulting integrated estimator as

$$\hat\sigma_t^2 = w_t\,\hat\sigma^2_{t,\mathrm{time}} + (1 - w_t)\,\hat\sigma^2_{t,\mathrm{state}}.$$

The question is how to choose the dynamic weight $w_t$ to optimize the performance. A reasonable
approach is to minimize the variance of the combined estimator, leading to the dynamic optimal
weights

$$w_t = \frac{\mathrm{Var}(\hat\sigma^2_{t,\mathrm{state}})}{\mathrm{Var}(\hat\sigma^2_{t,\mathrm{time}}) + \mathrm{Var}(\hat\sigma^2_{t,\mathrm{state}})}, \tag{1}$$

since the two estimators are nearly independent. The unknown variances in (1) can easily be
estimated, as shown in Section 3. Another approach is the Bayesian approach, which regards the historical
information as the prior. We will explore this idea in Section 4. The proposed method is also
applicable to other estimation problems in time series such as forecasting the mean function and
the volatility matrix of multivariate time series.
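
The variance-minimizing combination described above can be sketched in a few lines. This is only an illustration under the assumption that the two estimates are exactly uncorrelated; the function names and the numbers in the usage lines are hypothetical and do not come from the paper.

```python
def optimal_weight(var_time, var_state):
    """Weight on the time-domain estimate that minimizes the variance of
    w * est_time + (1 - w) * est_state when the two are uncorrelated."""
    return var_state / (var_time + var_state)


def integrate(est_time, est_state, var_time, var_state):
    """Combine two (assumed independent) volatility estimates."""
    w = optimal_weight(var_time, var_state)
    combined = w * est_time + (1.0 - w) * est_state
    # Variance of the combination under independence.
    combined_var = w ** 2 * var_time + (1.0 - w) ** 2 * var_state
    return combined, w, combined_var


# Hypothetical inputs, for illustration only.
est, w, v = integrate(est_time=4.0, est_state=2.0, var_time=3.0, var_state=1.0)
print(est, w, v)
```

With these weights the combined variance equals var_time * var_state / (var_time + var_state), which is never larger than either input variance.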

To appreciate the intuition behind our approach, let us consider the diffusion process

$$dr_t = \mu(r_t)\,dt + \sigma(r_t)\,dW_t, \tag{2}$$

where $W_t$ is a Wiener process. This diffusion process is frequently used to model asset prices and
the yields of bonds, which are fundamental to fixed income securities, financial markets, consumer
spending, corporate earnings, asset pricing and inflation. The family of models includes famous ones
such as the Vasicek (1977) model, the CIR model (Cox et al., 1985) and the CKLS model (Chan
et al., 1992). Suppose that at time t we have historical data $\{r_{t_i}\}_{i=0}^{N}$ from the process (2)
with a sampling interval $\Delta$. Our aim is to estimate the volatility $\sigma_t^2 \equiv \sigma^2(r_t)$.
Let $Y_i = \Delta^{-1/2}(r_{t_{i+1}} - r_{t_i})$. Then for the model (2), the Euler approximation scheme is

$$Y_i \approx \mu(r_{t_i})\Delta^{1/2} + \sigma(r_{t_i})\varepsilon_i, \tag{3}$$

where $\varepsilon_i \sim_{i.i.d.} N(0, 1)$ for $i = 0, \cdots, N - 1$. Fan & Zhang (2003) studied the impact of
the order of difference on statistical estimation. They found that while a higher order can possibly reduce
approximation errors, it increases the variances of data substantially. They recommended the Euler
scheme (3) for most practical situations. The time-domain smoothing relies on the smoothness of
$\sigma(r_{t_i})$ as a function of time $t_i$. This leads to the exponential smoothing estimator in Section 2.1.
On the other hand, the state-domain smoothing relies on structural invariability implied by the
stationarity: the conditional variance of $Y_i$ given $r_{t_i}$ remains the same even for the data in the
history. In other words, historical data also furnish information about $\sigma(\cdot)$ at the current time.
Combining these two nearly independent estimators leads to a better estimator.

In this paper, we focus on the estimation of the volatility of a portfolio to illustrate how to deal
with the problem of dynamic integration. Asymptotic normality of the proposed estimator is
established and extensive simulations are conducted, which theoretically and empirically demonstrate
the dominant performance of the integrated estimation.



2 Estimation of Volatility

Volatility estimation is an important issue in modern financial analysis since it pervades
almost every facet of this field. It is a measure of risk of a portfolio and is related to the
Value-at-Risk (VaR), asset pricing, portfolio allocation, capital requirement and risk adjusted returns,
among others. There is a large literature on estimating the volatility based on time-domain and
state-domain smoothing. For an overview, see the recent book by Fan & Yao (2003).

2.1 Time-domain estimator

A popular version of the time-domain estimator of the volatility is the moving average estimator:

$$\hat\sigma^2_{MA,t} = n^{-1}\sum_{i=t-n}^{t-1} Y_i^2, \tag{4}$$

where n is the size of the moving window. This estimator ignores the drift component, which
contributes to the variance in the order of $O(\Delta)$ instead of $O(\Delta^{1/2})$ (see Stanton, 1997 and Fan &
Zhang, 2003), and utilizes local n data points. An extension of the moving average estimator is the
exponential smoothing estimation of the volatility given by

$$\hat\sigma^2_{ES,t} = (1-\lambda)Y_{t-1}^2 + \lambda\hat\sigma^2_{ES,t-1} = (1-\lambda)\{Y_{t-1}^2 + \lambda Y_{t-2}^2 + \lambda^2 Y_{t-3}^2 + \cdots\}, \tag{5}$$

where $\lambda$ is a smoothing parameter that controls the size of the local neighborhood. The RiskMetrics
of J. P. Morgan (1996), which is used for measuring the risks, called Value at Risk (VaR), of financial
assets, recommends $\lambda = 0.94$ and $\lambda = 0.97$ respectively for calculating VaR of the daily and monthly
returns.

The exponential smoothing estimator in (5) is a weighted sum of the squared returns prior to
time t. Since the weight decays exponentially, it essentially uses recent data. A slightly modified
version that explicitly uses only n data points before time t is

$$\hat\sigma^2_{ES,t} = \frac{1-\lambda}{1-\lambda^n}\sum_{i=1}^{n}\lambda^{i-1}Y_{t-i}^2. \tag{6}$$

When $\lambda = 1$, it becomes the moving average estimator (4). With slight abuse of notation, we will
also denote the estimator for $\sigma^2(r_t)$ as $\hat\sigma^2_{ES,t}$.
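
As a minimal sketch of the two time-domain estimators, the code below computes the moving average and the truncated exponential smoothing estimate from a list of returns. It assumes the truncated weights are proportional to $\lambda^{i-1}$ and normalized to sum to one; the function names and the convention that `y[t - i]` holds $Y_{t-i}$ (with `t >= n`) are illustrative choices, not the authors' code.

```python
def moving_average_vol(y, t, n):
    """Moving average estimator: mean of y[t-n]**2, ..., y[t-1]**2."""
    window = y[t - n:t]
    return sum(v * v for v in window) / n


def exp_smoothing_vol(y, t, n, lam):
    """Truncated exponential smoothing over the n returns before time t.

    Uses weights (1 - lam) / (1 - lam**n) * lam**(i - 1) on y[t - i]**2,
    i = 1, ..., n, which sum to one.
    """
    if lam == 1.0:
        # Limiting case: equal weights, i.e. the moving average.
        return moving_average_vol(y, t, n)
    norm = (1.0 - lam) / (1.0 - lam ** n)
    return norm * sum(lam ** (i - 1) * y[t - i] ** 2 for i in range(1, n + 1))


# Hypothetical returns, for illustration only.
y = [1.0, -2.0, 3.0, -1.0]
print(moving_average_vol(y, 4, 2), exp_smoothing_vol(y, 4, 2, 0.5))
```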

All of the time-domain smoothing is based on the assumption that the returns $Y_{t-1}, Y_{t-2},
\cdots, Y_{t-n}$ have approximately the same volatility. In other words, $\sigma(r_t)$ in (3) is continuous in time
t. The following proposition gives the condition under which this holds.

Proposition 1 Under Conditions (A1) and (A2) in the Appendix, we have

$$|\sigma^2(r_s) - \sigma^2(r_u)| \le K|s - u|^{(p-1)/(2p)},$$

for any $s, u \in [t - \eta, t]$, where the coefficient K satisfies $E[K^{2(p+\delta)}] < \infty$ and $\eta$ is a positive constant.

With the above Hölder continuity, we can establish the asymptotic normality of the time-domain
estimator.

Theorem 1 Suppose that $\sigma_t^2 > 0$. Under conditions (A1) and (A2), if $n \to +\infty$ and $n\Delta \to 0$,
then

$$\hat\sigma^2_{ES,t} - \sigma_t^2 \longrightarrow 0, \quad \text{a.s.}$$

Moreover, if the limit $c = \lim_{n\to\infty} n(1 - \lambda)$ exists and $n\Delta^{(p-1)/(2p-1)} \to 0$,

$$\sqrt{n}\,[\hat\sigma^2_{ES,t} - \sigma_t^2]/s_{1,t} \xrightarrow{D} N(0, 1),$$

where $s_{1,t}^2 = c\,\sigma_t^4\,\dfrac{e^c + 1}{e^c - 1}$.



Theorem 1 has very interesting implications. Even though the data in the local time-window
are highly correlated (indeed, the correlation tends to one), we can compute the variance as if the
data were independent. Indeed, if the data in (6) were independent and locally homogeneous, we
would have

$$\mathrm{Var}(\hat\sigma^2_{ES,t}) \approx \frac{(1-\lambda)^2}{(1-\lambda^n)^2}\sum_{i=1}^{n}\lambda^{2(i-1)}\,2\sigma_t^4
= \frac{2\sigma_t^4(1-\lambda)(1+\lambda^n)}{(1+\lambda)(1-\lambda^n)} \approx \frac{1}{n}s_{1,t}^2.$$

This is indeed the asymptotic variance given in Theorem 1.
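
The identity above can be checked numerically. The sketch below compares the direct sum with the closed form and with the limit $c\,\sigma_t^4 (e^c + 1)/(e^c - 1)$ obtained by letting $\lambda = 1 - c/n$; the function names and the values of n and c are illustrative assumptions.

```python
import math


def es_variance_direct(lam, n, sigma4=1.0):
    """Variance of the truncated exponential smoother for independent
    N(0, sigma^2) returns, summed term by term (Var(Y^2) = 2 * sigma^4)."""
    norm = (1.0 - lam) / (1.0 - lam ** n)
    return 2.0 * sigma4 * norm ** 2 * sum(lam ** (2 * (i - 1)) for i in range(1, n + 1))


def es_variance_closed(lam, n, sigma4=1.0):
    """Closed form of the same sum."""
    return 2.0 * sigma4 * (1.0 - lam) * (1.0 + lam ** n) / ((1.0 + lam) * (1.0 - lam ** n))


def s1_squared(c, sigma4=1.0):
    """Limit of n * Var as n grows with lam = 1 - c / n."""
    return c * sigma4 * (math.exp(c) + 1.0) / (math.exp(c) - 1.0)


n, c = 5000, 3.0  # illustrative values
lam = 1.0 - c / n
print(n * es_variance_direct(lam, n), n * es_variance_closed(lam, n), s1_squared(c))
```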

2.2 Estimation in state-domain 

To obtain the nonparametric estimation of the functions $f(x) = \Delta^{1/2}\mu(x)$ and $\sigma^2(x)$ in (3),
we use the local linear smoother studied in Ruppert et al. (1997) and Fan & Yao (1998). The
local linear technique is chosen for its several nice properties, such as the asymptotic minimax
efficiency and the design adaptation. Further, it automatically corrects edge effects and facilitates
the bandwidth selection (Fan & Yao, 2003).

To facilitate the theoretical argument in Section 3, we exclude the n data points used in the
time-domain fitting. Thus, the historical data at time t are $\{(r_{t_i}, Y_i), i = 0, \cdots, N - n - 1\}$. Let
$\hat f(x) = \hat a_1$ be the local linear estimator that solves the following weighted least-squares problem:

$$(\hat a_1, \hat a_2) = \arg\min_{a_1, a_2}\sum_{i=0}^{N-n-1}[Y_i - a_1 - a_2(r_{t_i} - x)]^2 K_{h_1}(r_{t_i} - x),$$

where $K(\cdot)$ is a kernel function and $h_1 > 0$ is a bandwidth. Denote the squared residuals by
$\hat R_i = \{Y_i - \hat f(r_{t_i})\}^2$. Then the local linear estimator of $\sigma^2(x)$ is $\hat\sigma_S^2(x) = \hat\beta_0$ given by

$$(\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0, \beta_1}\sum_{i=0}^{N-n-1}\{\hat R_i - \beta_0 - \beta_1(r_{t_i} - x)\}^2 W_h(r_{t_i} - x) \tag{7}$$

with kernel function W and bandwidth h. Fan & Yao (1998) give strategies for bandwidth selection.
It was shown in Stanton (1997) and Fan & Zhang (2003) that $Y_i^2$ instead of $\hat R_i$ in (7) can also be
used for the estimation of $\sigma^2(x)$.
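
The two-step state-domain fit can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes the Epanechnikov kernel for both steps, solves each weighted least-squares problem in closed form, and uses hypothetical function names and test data.

```python
def epanechnikov(u):
    """Epanechnikov kernel on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear(xs, ys, x0, h, kernel=epanechnikov):
    """Intercept of the kernel-weighted least-squares line fitted around x0."""
    w = [kernel((x - x0) / h) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("too few distinct points inside the kernel window")
    return (s2 * t0 - s1 * t1) / det


def state_domain_vol(rates, y, x0, h1, h):
    """Estimate sigma^2(x0): fit the mean, then smooth the squared residuals."""
    fitted = [local_linear(rates, y, r, h1) for r in rates]
    resid2 = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    return local_linear(rates, resid2, x0, h)


# Hypothetical data: an exactly linear mean with no noise.
xs = [i / 10 for i in range(21)]
ys = [2.0 + 3.0 * x for x in xs]
print(local_linear(xs, ys, 1.0, 0.5), state_domain_vol(xs, ys, 1.0, 0.5, 0.5))
```

A local linear fit reproduces a linear function exactly, so the first value printed is the true mean at 1.0 and the estimated variance is numerically zero.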

The asymptotic bias and variance of $\hat\sigma_S^2(x)$ are given by Fan & Zhang (2003, Theorem 4). Set
$\nu_j = \int u^j W^2(u)\,du$ for $j = 0, 1, 2$. Let $p(\cdot)$ be the invariant density function of the Markov process
$\{r_s\}$ from (2). Then, we have

Theorem 2 Let x be in the interior of the support of $p(\cdot)$. Suppose that the second derivatives of $\mu(\cdot)$
and $\sigma^2(\cdot)$ exist in a neighborhood of x. Under conditions (A3)-(A7), we have

$$\sqrt{(N - n)h}\,[\hat\sigma_S^2(x) - \sigma^2(x)]/s_2(x) \xrightarrow{D} N(0, 1),$$

where $s_2^2(x) = 2\nu_0\sigma^4(x)/p(x)$.



3 Dynamic Integration of time and state domain estimators 



In this section, we first show how the optimal dynamic weights in (1) can be estimated and then
prove that the time-domain and state-domain estimators are indeed asymptotically independent.



3.1 Estimation of dynamic weights 

For the exponential smoothing estimator in (6), we can apply the asymptotic formula given
in Theorem 1 to get an estimate of its asymptotic variance. However, since the estimator is a
weighted average of $Y_{t-i}^2$, we can obtain its variance directly by assuming $Y_{t-j} \sim N(0, \sigma_t^2)$ for small
j. Indeed, with the above local homogeneous model, we have

$$\mathrm{Var}(\hat\sigma^2_{ES,t}) \approx \frac{(1-\lambda)^2}{(1-\lambda^n)^2}\,2\sigma_t^4\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda^{i+j-2}\rho(|i-j|)
= \frac{2(1-\lambda)^2\sigma_t^4}{(1-\lambda^n)^2}\Big\{\frac{1-\lambda^{2n}}{1-\lambda^2} + 2\sum_{k=1}^{n-1}\frac{\rho(k)\lambda^k(1-\lambda^{2(n-k)})}{1-\lambda^2}\Big\}, \tag{8}$$

where $\rho(j) = \mathrm{Cor}(Y_t^2, Y_{t-j}^2)$ is the autocorrelation of the series $\{Y_{t-j}^2\}$. The autocorrelation can be
estimated from the data in history. Note that due to the locality of the exponential smoothing,
only $\rho(j)$'s with the first 30 lags, say, contribute to the variance calculation.

We now turn to estimating the variance of $\hat\sigma^2_{S,t} = \hat\sigma_S^2(r_t)$. Details can be found in Fan & Yao
(1998) and §6.2 of Fan & Yao (2003). Let

$$V_j(x) = \sum_{i=1}^{t-1}(r_{t_i} - x)^j\,W\Big(\frac{r_{t_i} - x}{h}\Big)$$

and

$$\xi_i(x) = W\Big(\frac{r_{t_i} - x}{h}\Big)\{V_2(x) - (r_{t_i} - x)V_1(x)\}/\{V_0(x)V_2(x) - V_1(x)^2\}.$$

Then the local linear estimator can be expressed as

$$\hat\sigma_S^2(x) = \sum_{i=1}^{t-1}\xi_i(x)\hat R_i$$

and its variance can be approximated as

$$\mathrm{Var}(\hat\sigma_S^2(x)) \approx \mathrm{Var}\{(Y_1 - f(x))^2 \mid r_{t_1} = x\}\sum_{i=1}^{t-1}\xi_i^2(x). \tag{9}$$

See also Figure 1 and the discussions at the end of §2.1. Again, for simplicity, we assume that
$\mathrm{Var}(\hat R_i \mid r_{t_i} = x) \approx 2\sigma^4(x)$, which holds if $\varepsilon_t \sim N(0, 1)$.

Combining (8) and (9), we propose to combine the time-domain and the state-domain
estimators with the dynamic weight

$$\hat w_t = \frac{\hat\sigma^4_{S,t}\sum_{i=1}^{t-1}\xi_i^2(r_t)}{\hat\sigma^4_{S,t}\sum_{i=1}^{t-1}\xi_i^2(r_t) + c_t\,\hat\sigma^4_{ES,t}}, \tag{10}$$

where $c_t = \frac{(1-\lambda)^2}{(1-\lambda^n)^2}\Big\{\frac{1-\lambda^{2n}}{1-\lambda^2} + 2\sum_{k=1}^{n-1}\frac{\rho(k)\lambda^k(1-\lambda^{2(n-k)})}{1-\lambda^2}\Big\}$ [see (8)].
This is obtained by substituting (8) and (9) into (1). For practical implementation, we truncate the
series $\{\rho(i)\}_{i=1}^{n-1}$ in the summation as $\{\rho(i)\}_{i=1}^{30}$. This results in the dynamically integrated estimator

$$\hat\sigma^2_{I,t} = \hat w_t\,\hat\sigma^2_{ES,t} + (1 - \hat w_t)\,\hat\sigma^2_{S,t}, \tag{11}$$

where $\hat\sigma^2_{S,t} = \hat\sigma_S^2(r_t)$. The function $\hat\sigma_S^2(\cdot)$ depends on the time t and we need to update this function
as time evolves. Fortunately, we need only to know the function at the point $r_t$. This reduces
significantly the computational cost. The computational cost can be reduced further if we update
the estimated function $\hat\sigma^2_{S,t}$ at a prescribed time schedule (e.g. once every two months for weekly
data).
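
The weight computation can be sketched as below. The variance factor of the time-domain estimator is evaluated directly from the double sum over $\lambda^{i+j-2}\rho(|i-j|)$ rather than from a closed form, and autocorrelations beyond the supplied lags are treated as zero. The function names, argument layout and test values are assumptions made for illustration.

```python
def c_factor(lam, n, rho):
    """Variance factor of the truncated exponential smoother.

    rho[k - 1] is the lag-k autocorrelation of the squared returns;
    lags beyond len(rho) are taken to be zero (truncation).
    """
    norm2 = ((1.0 - lam) / (1.0 - lam ** n)) ** 2
    total = 0.0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            k = abs(i - j)
            r = 1.0 if k == 0 else (rho[k - 1] if k <= len(rho) else 0.0)
            total += lam ** (i + j - 2) * r
    return norm2 * total


def dynamic_weight(sig2_es, sig2_state, xi, lam, n, rho):
    """Weight on the time-domain estimate: Var(state) / (Var(time) + Var(state)).

    xi holds the local linear weights xi_i(r_t); the common factor 2 in both
    variance approximations cancels and is omitted.
    """
    var_state = sig2_state ** 2 * sum(x * x for x in xi)
    var_time = c_factor(lam, n, rho) * sig2_es ** 2
    return var_state / (var_state + var_time)


def integrated_vol(sig2_es, sig2_state, xi, lam, n, rho):
    """Dynamically weighted combination of the two estimates."""
    w = dynamic_weight(sig2_es, sig2_state, xi, lam, n, rho)
    return w * sig2_es + (1.0 - w) * sig2_state
```

Because the weight depends on the local linear weights at the current rate and on the current estimates, it changes from one time point to the next.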

Finally, we would like to note that in the choice of weight, only the variance of the estimated
volatility is considered, rather than the mean square error. This is mainly to facilitate the
dynamically weighted procedure. Since the smoothing parameters in $\hat\sigma^2_{ES,t}$ and $\hat\sigma_S^2(x)$ have been tuned to
optimize their performance separately, their bias-variance trade-offs have already been taken into account.
Hence, controlling the variance of the integrated estimator $\hat\sigma^2_{I,t}$ also controls, to some extent,
the bias of the estimator. Our method focuses only on the estimation of volatility, but the method
can be adapted to other estimation problems, such as the value at risk studied in Duffie & Pan
(1997), the drift estimation for diffusion considered in Spokoiny (2000), and the volatility matrix
for multivariate time series. Further study along this topic is beyond the scope of the current
investigation.

3.2 Sampling properties 

The fundamental component in the choice of dynamic weights is the asymptotic independence
between the time- and state-domain estimators. By ignoring the drift term (see Stanton, 1997; Fan
& Zhang, 2003), both the estimators $\hat\sigma^2_{ES,t}$ and $\hat\sigma^2_{S,t}$ are linear in $\{Y_i^2\}$. The following theorem
shows that the time-domain and state-domain estimators are indeed asymptotically independent.
To facilitate the notation, we present the result at the current time $t_N$.

Theorem 3 Let $s_{2,t_N} = s_2(r_{t_N})$. Under the conditions of Theorems 1 and 2, if the condition (A2)
holds at point $t_N$, we have

(a) asymptotic independence:

$$\Big[\sqrt{n}\,(\hat\sigma^2_{ES,t_N} - \sigma^2_{t_N})/s_{1,t_N},\ \sqrt{(N-n)h}\,(\hat\sigma^2_{S,t_N} - \sigma^2_{t_N})/s_{2,t_N}\Big]^T \xrightarrow{D} N(0, I_2).$$

(b) asymptotic normality of $\hat\sigma^2_{I,t_N}$: if the limit $d = \lim_{N\to\infty} n/[(N - n)h]$ exists, then

$$\sqrt{(N-n)h/\omega}\,[\hat\sigma^2_{I,t_N} - \sigma^2_{t_N}] \xrightarrow{D} N(0, 1),$$

where $\omega = w_{t_N}^2 s_{1,t_N}^2/d + (1 - w_{t_N})^2 s_{2,t_N}^2$.

From Theorem 3, based on the optimal weight, the asymptotic relative efficiencies of $\hat\sigma^2_{I,t_N}$ with
respect to $\hat\sigma^2_{S,t_N}$ and $\hat\sigma^2_{ES,t_N}$ are respectively

$$\mathrm{eff}(\hat\sigma^2_{I,t_N}, \hat\sigma^2_{S,t_N}) = 1 + d\,s_{2,t_N}^2/s_{1,t_N}^2 \quad\text{and}\quad \mathrm{eff}(\hat\sigma^2_{I,t_N}, \hat\sigma^2_{ES,t_N}) = 1 + s_{1,t_N}^2/(d\,s_{2,t_N}^2),$$

which are greater than one. This demonstrates that the integrated estimator $\hat\sigma^2_{I,t}$ is more efficient
than the time-domain and the state-domain estimators.
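
The two efficiency expressions follow from a short calculation, sketched here under the simplifying assumption that the two estimators are exactly independent with variances $v_1$ (time domain) and $v_2$ (state domain), and that $n/[(N-n)h]$ is replaced by its limit $d$. The symbols $v_1$, $v_2$ and $w$ are introduced only for this illustration.

```latex
\begin{aligned}
v_1 &= s_{1,t_N}^2/n, \qquad v_2 = s_{2,t_N}^2/\{(N-n)h\}, \qquad w = \frac{v_2}{v_1 + v_2},\\
\operatorname{Var}(\hat\sigma^2_{I,t_N}) &\approx w^2 v_1 + (1-w)^2 v_2 = \frac{v_1 v_2}{v_1 + v_2},\\
\frac{v_2}{\operatorname{Var}(\hat\sigma^2_{I,t_N})} &= 1 + \frac{v_2}{v_1} = 1 + \frac{d\,s_{2,t_N}^2}{s_{1,t_N}^2},
\qquad
\frac{v_1}{\operatorname{Var}(\hat\sigma^2_{I,t_N})} = 1 + \frac{v_1}{v_2} = 1 + \frac{s_{1,t_N}^2}{d\,s_{2,t_N}^2}.
\end{aligned}
```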

4 Bayesian integration of volatility estimates

Another possible approach is to consider the historical information as the prior and to incor- 
porate them in the estimation of volatility by the Bayesian framework. We now explore such an 
approach. 

4.1 Bayesian estimation of volatility 

The Bayesian approach is to regard the recent data $Y_{t-n}, \cdots, Y_{t-1}$ as an independent sample
from $N(0, \sigma^2)$ [see (3)] and to regard the historical information as being summarized in a prior. To
incorporate historical information, we assume that the variance $\sigma^2$ follows an Inverse Gamma
distribution with parameters a and b, which has the density function

$$f(\sigma^2) = b^a\,\Gamma^{-1}(a)\,\{\sigma^2\}^{-(a+1)}\exp(-b/\sigma^2).$$

Denote this by $\sigma^2 \sim IG(a, b)$. It is a well-known fact that

$$E(\sigma^2) = \frac{b}{a-1}, \qquad \mathrm{Var}(\sigma^2) = \frac{b^2}{(a-1)^2(a-2)}, \qquad \mathrm{mode}(\sigma^2) = \frac{b}{a+1}. \tag{12}$$

The hyperparameters a and b will be estimated from historical data such as the state-domain
estimators.

It can easily be shown that the posterior density of $\sigma^2$ given $Y = (Y_{t-n}, \cdots, Y_{t-1})$ is $IG(a^*, b^*)$,
where

$$a^* = a + \frac{n}{2}, \qquad b^* = \frac{1}{2}\sum_{i=1}^{n}Y_{t-i}^2 + b.$$

From (12), the Bayesian mean of $\sigma^2$ is

$$\hat\sigma^2 = \frac{b^*}{a^* - 1} = \Big\{\sum_{i=1}^{n}Y_{t-i}^2 + 2b\Big\}\Big/\{2(a-1) + n\}.$$

This Bayesian estimator can easily be written as

$$\hat\sigma^2_{B,t} = \frac{n}{n + 2(a-1)}\,\hat\sigma^2_{MA,t} + \frac{2(a-1)}{n + 2(a-1)}\,\hat\sigma^2_P, \tag{13}$$

where $\hat\sigma^2_{MA,t}$ is the moving average estimator given by (4) and $\hat\sigma^2_P = b/(a-1)$ is the prior mean,
which will be determined from the historical data. This combines the estimate based on the data
and prior knowledge.

The Bayesian estimator (13) utilizes the local average of n data points. To incorporate the
exponential smoothing estimator (5), we regard it as the local average of

$$n^* = \sum_{i=1}^{n}\lambda^{i-1} = \frac{1-\lambda^n}{1-\lambda} \tag{14}$$

data points. This leads to the following integrated estimator

$$\hat\sigma^2_{B,t} = \frac{n^*}{n^* + 2(a-1)}\,\hat\sigma^2_{ES,t} + \frac{2(a-1)}{2(a-1) + n^*}\,\hat\sigma^2_P
= \frac{1-\lambda^n}{1-\lambda^n + 2(a-1)(1-\lambda)}\,\hat\sigma^2_{ES,t} + \frac{2(a-1)(1-\lambda)}{1-\lambda^n + 2(a-1)(1-\lambda)}\,\hat\sigma^2_P. \tag{15}$$

In particular, when $\lambda = 1$, the estimator (15) reduces to (13).

4.2 Estimation of Prior Parameters 

A reasonable source for obtaining the prior information in (15) is the historical data
up to time t. Hence, the hyper-parameters a and b should depend on t and can be chosen to match
the historical information. Using the approximation model (3), we have

$$E[(Y_t - f(r_t))^2 \mid r_t] \approx \sigma^2(r_t), \qquad \mathrm{Var}[(Y_t - f(r_t))^2 \mid r_t] \approx 2\sigma^4(r_t).$$

These can be estimated from the historical data up to time t, namely, by the state-domain estimator
$\hat\sigma_S^2(r_t)$. Since we have assumed that the prior distribution for $\sigma_t^2$ is $IG(a_t, b_t)$, then by the method of
moments, we would set

$$E(\sigma_t^2) = \frac{b_t}{a_t - 1} = \hat\sigma_S^2(r_t), \qquad \mathrm{Var}(\sigma_t^2) = \frac{b_t^2}{(a_t - 1)^2(a_t - 2)} = 2\hat\sigma_S^4(r_t).$$

Solving the above equations, we obtain that

$$\hat a_t = 2.5 \quad\text{and}\quad \hat b_t = 1.5\,\hat\sigma_S^2(r_t).$$

Substituting this into (15), we obtain the following estimator

$$\hat\sigma^2_{B,t} = \frac{1-\lambda^n}{1-\lambda^n + 3(1-\lambda)}\,\hat\sigma^2_{ES,t} + \frac{3(1-\lambda)}{1-\lambda^n + 3(1-\lambda)}\,\hat\sigma^2_{S,t}. \tag{16}$$

Unfortunately, the weights in (16) are static: they do not depend on the time t. Hence, the
Bayesian method does not produce a satisfactory answer to this problem.
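
To make the static nature of the Bayesian weights concrete, the sketch below computes the weight placed on the exponential smoothing estimate from the smoothing parameter, the window size and the prior shape parameter only. The function names are hypothetical; the default `a = 2.5` is the method-of-moments value derived above.

```python
def bayes_weight(lam, n, a=2.5):
    """Weight on the exponential smoothing estimate in the Bayesian combination.

    n_star is the effective number of data points in the smoother.
    """
    n_star = float(n) if lam == 1.0 else (1.0 - lam ** n) / (1.0 - lam)
    return n_star / (n_star + 2.0 * (a - 1.0))


def bayes_vol(sig2_es, sig2_state, lam, n, a=2.5):
    """Bayesian combination with the state-domain estimate as prior mean."""
    w = bayes_weight(lam, n, a)
    return w * sig2_es + (1.0 - w) * sig2_state


# The weight is the same at every time point for fixed lam and n.
print(bayes_weight(0.94, 52))
```

Nothing in `bayes_weight` depends on the data at time t, which is the limitation noted above.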



5 Numerical Analysis 



To facilitate the presentation, we use the simple abbreviations in Table 1 to denote five volatility
estimation methods. Details of the first three methods can be found in Fan & Gu (2003). In
particular, the first method is to estimate the volatility using the standard deviation of the yields
in the past year and the RiskMetrics method is based on the exponential smoothing with $\lambda = 0.94$.
The semiparametric method of Fan & Gu (2003) is an extension of a local model used in the
exponential smoothing, with the smoothing parameter determined by minimizing the prediction
error. It includes the exponential smoothing with $\lambda$ selected by data as a specific example.

Table 1: Abbreviations of five volatility estimators
Hist: the historical method
RiskM: the RiskMetrics method of J. P. Morgan
Semi: the semiparametric estimator (SEV) in Fan & Gu (2003)
NonBay: the nonparametric Bayesian method in (16) with $\lambda = 0.94$
Integ: the integration method of time and state domains in (11)



The following four measures are employed to assess the performance of different procedures for
estimating the volatility. Other related measures can also be used. See Dave & Stahl (1997).

Measure 1. Exceedence ratio against confidence level.

This measure counts the number of the events for which the loss of an asset exceeds the loss
predicted by the normal model at a given confidence level $\alpha$. With estimated volatility, under the
normal model, the one-period VaR is estimated by $\Phi^{-1}(\alpha)\hat\sigma_t$, where $\Phi^{-1}(\alpha)$ is the $\alpha$ quantile of the
standard normal distribution. For each estimated VaR, the Exceedence Ratio (ER) is computed as

$$\mathrm{ER}(\hat\sigma_t^2) = m^{-1}\sum_{i=T+1}^{T+m} I(Y_i < \Phi^{-1}(\alpha)\hat\sigma_i), \tag{17}$$

for an out-sample of size m. This gives an indication of how effectively the volatility estimator can
be used for predicting the one-period VaR. Note that the Monte Carlo error for this measure has
an approximate size $\{\alpha(1 - \alpha)/m\}^{1/2}$, even when the true $\sigma_t$ is used. For example, with $\alpha = 5\%$
and m = 1000, the Monte Carlo error is around 0.69%. Thus, unless the post-sample size m is
large enough, this measure has difficulty in differentiating the performance of various estimators
due to the presence of large error margins. Note that the ER depends strongly on the assumption
of normality. If the underlying return process is non-normal, the Student's t(5) say, the ER will
be grossly overestimated even with the true volatility. In our simulation study, we will employ the
true $\alpha$-quantile of the error distribution instead of $\Phi^{-1}(\alpha)$ in (17) to compute the ER. For real data
analysis, we use the $\alpha$-quantile of the last 250 residuals for the in-sample data.
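
A minimal sketch of the exceedence ratio and of its Monte Carlo error follows. The function names are illustrative; the optional `quantile` argument stands in for replacing the normal quantile by another $\alpha$-quantile, as described above.

```python
import math
from statistics import NormalDist


def exceedance_ratio(y, sigma_hat, alpha=0.05, quantile=None):
    """Fraction of out-of-sample returns below the predicted alpha-quantile.

    sigma_hat holds the forecast standard deviations aligned with y.
    """
    q = NormalDist().inv_cdf(alpha) if quantile is None else quantile
    hits = sum(1 for yi, si in zip(y, sigma_hat) if yi < q * si)
    return hits / len(y)


def monte_carlo_error(alpha, m):
    """Approximate standard error of the exceedence ratio over m periods."""
    return math.sqrt(alpha * (1.0 - alpha) / m)


print(round(100 * monte_carlo_error(0.05, 1000), 2))  # 0.69 (percent)
```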



Measure 2. Mean Absolute Deviation Error.

To motivate this measure, let us first consider the mean square errors:

$$\mathrm{PE}(\hat\sigma_t^2) = m^{-1}\sum_{i=T+1}^{T+m}(Y_i^2 - \hat\sigma_i^2)^2.$$

The expected value can be decomposed as

$$E(\mathrm{PE}) = m^{-1}\sum_{i=T+1}^{T+m}E(\sigma_i^2 - \hat\sigma_i^2)^2 + m^{-1}\sum_{i=T+1}^{T+m}E(Y_i^2 - \sigma_i^2)^2. \tag{18}$$

Note that the first term reflects the effectiveness of the estimated volatility while the second term is
the size of the stochastic error, independent of estimators. As in all statistical prediction problems,
the second term is usually of an order of magnitude larger than the first term. Thus, a small
improvement on PE could mean substantial improvement over the estimated volatility. However,
due to the well-known fact that financial time series contain outliers, the mean-square error is not
a robust measure. Therefore, we used the mean-absolute deviation error (MADE):

$$\mathrm{MADE}(\hat\sigma_t^2) = m^{-1}\sum_{i=T+1}^{T+m}|Y_i^2 - \hat\sigma_i^2|.$$

Measure 3. Square-root Absolute Deviation Error.

An alternative variation to MADE is the square-Root Absolute Deviation Error (RADE), which
is defined as

$$\mathrm{RADE}(\hat\sigma_t^2) = m^{-1}\sum_{i=T+1}^{T+m}\Big|\,|Y_i| - \sqrt{\tfrac{2}{\pi}}\,\hat\sigma_i\Big|.$$

The constant factor comes from the fact that $E|\varepsilon_t| = \sqrt{2/\pi}$ for $\varepsilon_t \sim N(0, 1)$. If the underlying error
distribution deviates from normality, this measure is not robust.

Measure 4. Ideal Mean Absolute Deviation Error.

To assess the estimation of the volatility in simulations, one can also employ the ideal mean
absolute deviation error (IMADE):

$$\mathrm{IMADE} = m^{-1}\sum_{i=T+1}^{T+m}|\hat\sigma_i^2 - \sigma_i^2|.$$

This measure calibrates the accuracy of the forecasted volatility in terms of the absolute difference
between the true and the forecasted one. However, for real data analysis, this measure is not
applicable.
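
The three deviation measures can be sketched directly from their definitions. The function names and the convention that the inputs are equal-length lists over the out-sample period are assumptions of this illustration.

```python
import math


def made(y, sig2_hat):
    """Mean absolute deviation between squared returns and forecast variances."""
    return sum(abs(yi * yi - s) for yi, s in zip(y, sig2_hat)) / len(y)


def rade(y, sig2_hat):
    """Square-root absolute deviation error; sqrt(2/pi) is E|eps| for N(0, 1)."""
    k = math.sqrt(2.0 / math.pi)
    return sum(abs(abs(yi) - k * math.sqrt(s)) for yi, s in zip(y, sig2_hat)) / len(y)


def imade(sig2_true, sig2_hat):
    """Ideal MADE: needs the true variances, so it applies to simulations only."""
    return sum(abs(a - b) for a, b in zip(sig2_true, sig2_hat)) / len(sig2_hat)


# Hypothetical values, for illustration only.
print(made([1.0, -2.0], [1.0, 1.0]), imade([1.0, 2.0], [2.0, 4.0]))
```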



5.1 Simulations 

To assess the performance of the five estimation methods in Table 1, we compute the average
and the standard deviation of each of the four measures over 600 simulations. Generally speaking,
the smaller the average (or the standard deviation), the better the estimation approach. We also
compute the "score" of an estimator, which is the percentage of times among 600 simulations that
the estimator outperforms the average of the 5 methods in terms of an effectiveness measure. To
be more specific, for example, consider RiskMetrics using MADE as an effectiveness measure. Let
$m_i$ be the MADE of the RiskMetrics estimator at the i-th simulation, and $\bar m_i$ the average of the
MADEs for the five estimators at the i-th simulation. Then the "score" of the RiskMetrics approach
in terms of the MADE is defined as

$$\frac{1}{600}\sum_{i=1}^{600} I(m_i < \bar m_i).$$

Obviously, the estimators with higher scores are preferred. In addition, we define a "relative loss"
of an estimator $\hat\sigma_t^2$ relative to $\hat\sigma^2_{I,t}$ in terms of MADEs as

$$\mathrm{RLOSS}(\hat\sigma_t^2, \hat\sigma^2_{I,t}) = \frac{\overline{\mathrm{MADE}}(\hat\sigma_t^2) - \overline{\mathrm{MADE}}(\hat\sigma^2_{I,t})}{\overline{\mathrm{MADE}}(\hat\sigma^2_{I,t})},$$

where $\overline{\mathrm{MADE}}(\hat\sigma_t^2)$ is the average of $\mathrm{MADE}(\hat\sigma_t^2)$ among simulations.
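
The score and the relative loss can be sketched as follows. Here `errors_all` is assumed to be a list with one error series per method (each of length equal to the number of simulations), and both quantities are returned in percent; these conventions and the function names are illustrative.

```python
def score(errors_method, errors_all):
    """Percentage of simulations in which a method beats the across-method average."""
    wins = 0
    for i, m_i in enumerate(errors_method):
        avg_i = sum(e[i] for e in errors_all) / len(errors_all)
        if m_i < avg_i:
            wins += 1
    return 100.0 * wins / len(errors_method)


def relative_loss(errors_method, errors_integrated):
    """Relative loss (percent) of a method against the integrated estimator."""
    mean_m = sum(errors_method) / len(errors_method)
    mean_i = sum(errors_integrated) / len(errors_integrated)
    return 100.0 * (mean_m - mean_i) / mean_i


# Hypothetical error series for two methods over two simulations.
a_err, b_err = [1.0, 3.0], [3.0, 1.0]
print(score(a_err, [a_err, b_err]), relative_loss([2.0], [1.0]))
```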

Example 1. To simulate the interest rate data, we consider the Cox-Ingersoll-Ross (CIR)
model:

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma r_t^{1/2}\,dW_t, \quad t \ge t_0,$$

where the spot rate, $r_t$, moves around a central location or long-run equilibrium level $\theta = 0.08571$ at
speed $\kappa = 0.21459$. The $\sigma$ is set to be 0.07830. These values of parameters are cited from Chapman
& Pearson (2000), which satisfy the condition $2\kappa\theta \ge \sigma^2$ so that the process $r_t$ is stationary and
positive. The model has been studied by Chapman & Pearson (2000) and Fan & Zhang (2003).

There are two methods to generate samples from this model. The first one is the discrete-time 
order 1.0 strong approximation scheme in Kloeden, et al. (1996); the second one is using the exact 
transition density detailed in Cox et al. (1985) and Fan & Zhang (2003). Here we use the first 
method to generate 600 series of data each with length 1200 of the weekly data from this model. 
For each simulation, we set the first 900 observations as the "in-sample" data and the last 300 
observations as the "out-sample" data. 
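
A rough sketch of such a simulation is given below. It assumes that the order 1.0 strong scheme is the Milstein scheme, a weekly step of 1/52, a start at the long-run level, and a floor at zero to keep the square root defined; these choices, like the function name, are assumptions of this illustration and are not taken from the paper.

```python
import math
import random


def simulate_cir(n, r0, kappa=0.21459, theta=0.08571, sigma=0.07830,
                 delta=1.0 / 52, seed=None):
    """Simulate n observations of a CIR process with a Milstein-type step."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n - 1):
        r = path[-1]
        eps = rng.gauss(0.0, 1.0)
        drift = kappa * (theta - r) * delta
        diffusion = sigma * math.sqrt(max(r, 0.0) * delta) * eps
        # Milstein correction: 0.5 * b * b' * delta * (eps^2 - 1), b = sigma * sqrt(r).
        correction = 0.25 * sigma ** 2 * delta * (eps * eps - 1.0)
        path.append(max(r + drift + diffusion + correction, 0.0))
    return path


series = simulate_cir(1200, r0=0.08571, seed=1)
in_sample, out_sample = series[:900], series[900:]
print(len(in_sample), len(out_sample))
```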

The results are summarized in Table 2, which shows that the performance of the integrated
estimator uniformly dominates the other estimators because of its highest score, lowest IMADE,
MADE, and RADE. The improvement in IMADE is over 100 percent. This shows that our
integrated volatility method better captures the volatility dynamics. The Bayesian method of
combining the estimates from the time and state domains outperforms all methods other than the
integrated one on most measures. The historical simulation method performed poorly due to
mis-specification of the function of the volatility parameter. The results here show the advantage of
aggregating the information of time domain and state domain. Note that all estimators have
reasonable ER values at level 0.05; in particular, the ER value of the integrated estimator is closest
to 0.05. To appreciate how much the integrated method improves over the other methods, we display
the mean absolute difference between the forecasted and the true volatility in Figure 2. It is seen
that the integrated method is much better than the others in terms of the difference.

Table 2: Comparisons of several volatility estimation methods

Measure  Empirical Formula    Hist     RiskM    Semi     NonBay   Integ
IMADE    Score (%)            17.17    20.83    32.00    44.33    99.83
         Ave (×10^-5)         0.2383   0.2088   0.1922   0.1833   0.0879
         Std (×10^-5)         0.1087   0.0746   0.0718   0.0675   0.0554
         Relative Loss (%)    171.20   137.61   118.79   108.60
MADE     Score (%)            39.83    54.33    60.00    57.17    72.17
         Ave (×10^-4)         0.1012   0.0930   0.0932   0.0924   0.0903
         Std (×10^-5)         0.3231   0.3152   0.3010   0.3119   0.2995
         Relative Loss (%)    12.03    2.95     3.16     2.31
RADE     Score (%)            40.83    53.33    54.83    57.50    74.50
         Ave                  0.0015   0.0015   0.0015   0.0015   0.0014
         Std (×10^-3)         0.2530   0.2552   0.2461   0.2536   0.2476
         Relative Loss (%)    6.88     1.66     2.13     1.27
ER       Ave                  0.0556   0.0547   0.0536   0.0535   0.0508
         Std                  0.0257   0.0106   0.0122   0.0107   0.0122

Example 2. There is a large literature on the estimation of volatility. In addition to the famous
parametric models such as ARCH and GARCH, stochastic volatility models have also received a
lot of attention. For an overview, see, for example, Barndorff-Nielsen & Shephard (2001, 2002),
Bollerslev & Zhou (2002) and references therein. We consider the following stochastic volatility
model:

$$dr_t = \sigma_t\,dB_t, \quad r_0 = 0,$$
$$dV_t = \kappa(\theta - V_t)\,dt + \sigma V_t\,dW_t, \quad V_0 = \eta, \quad V_t = \sigma_t^2,$$

where $W_t$ and $B_t$ are two independent standard Brownian motions.

There are two methods to generate samples from this model. One is the direct method, using
the result of Genon-Catalot et al. (1999). Let $a = 1 + 2\kappa/\sigma^2$ and $b = 2\theta\kappa/\sigma^2$. The conditions
(A1)-(A4) in the above paper are satisfied with the parameter values in the model being constants
as $\kappa = 3$, $\theta = 0.009$ and $\sigma^2 = 4$ and the initial random variable $\eta$ follows the Inverse Gamma
distribution. The value of $\theta$ is set as the real variance of the daily return for Standard & Poor 500
data from January 4, 1988 to December 29, 2000. The value $\sigma^2$ is to make the parameter a of the
stationary distribution IG(a, b) equal 2.5, the prior parameter in (16). If $\Delta \to 0$ and $n\Delta \to \infty$, then




T, where T ~ t(2a). 



Figure 2: The mean absolute difference between the forecasted and the true volatility. Solid -
integrated estimator (11); small circle - nonparametric Bayesian integrated estimator (16); star -
historical method; dashed - RiskMetrics; dash dotted - Semiparametric estimator in Fan & Gu
(2003).

Another method is the discretization of the model. Conditionally on $\mathcal{G} = \sigma(V_t, t \ge 0)$, the
random variables $Y_i$ are independent and follow $N(0, \bar V_i)$ with

$$\bar V_i = \frac{1}{\Delta}\int_{(i-1)\Delta}^{i\Delta} V_s\,ds.$$

To simulate the diffusion process $V_t$, one can use the following order 1.0 scheme with sampling
interval $\Delta^* = \Delta/30$,

$$V_{i+\Delta^*} = V_i + \kappa(\theta - V_i)\Delta^* + \sigma V_i(\Delta^*)^{1/2}\varepsilon_i + \tfrac{1}{2}\sigma^2 V_i\Delta^*(\varepsilon_i^2 - 1),$$

where $\{\varepsilon_i\}$ are independent random series from the standard normal distribution.
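
The discretization method can be sketched as below. The sketch assumes that the integral of $V_s$ over each sampling interval is approximated by a left-endpoint average over the 30 sub-steps, that the initial variance is supplied by the caller, and that a tiny positive floor is applied as a numerical safeguard; these details and the function name are assumptions of this illustration.

```python
import math
import random


def simulate_sv_returns(n, v0, kappa=3.0, theta=0.009, sigma=2.0,
                        delta=1.0 / 12, substeps=30, seed=None):
    """Simulate n returns Y_i ~ N(0, Vbar_i) from the stochastic volatility model."""
    rng = random.Random(seed)
    d = delta / substeps
    v = v0
    returns, avg_vars = [], []
    for _ in range(n):
        total = 0.0
        for _ in range(substeps):
            total += v  # left-endpoint Riemann sum of V over the interval
            e = rng.gauss(0.0, 1.0)
            v = (v + kappa * (theta - v) * d + sigma * v * math.sqrt(d) * e
                 + 0.5 * sigma ** 2 * v * d * (e * e - 1.0))
            v = max(v, 1e-12)  # numerical safeguard, not part of the scheme
        vbar = total / substeps
        avg_vars.append(vbar)
        returns.append(math.sqrt(vbar) * rng.gauss(0.0, 1.0))
    return returns, avg_vars


rets, vbars = simulate_sv_returns(1000, v0=0.009, seed=7)
print(len(rets), min(vbars) > 0.0)
```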

We simulate 600 series of 1000 monthly data using the second method with step size $\Delta = 1/12$.
For each simulated series, we set the first three quarters of the observations as the in-sample data and the
remaining observations as the out-sample data. The performance of each volatility estimation is
described in Table 3. Conclusions similar to those in Example 1 can be drawn from this example.

Table 3: Comparisons of several volatility estimation methods 

Measure  Empirical Formula     Hist     RiskM    Semi     NonBay   Integ
IMADE    Score (%)             27.67    49.33    52.83    58.83    77.17
         Ave                   0.0056   0.0051   0.0051   0.0050   0.0047
         Std                   0.0023   0.0019   0.0021   0.0018   0.0016
         Relative Loss (%)     17.74    7.63     6.56     5.18
MADE     Score (%)             35.33    52.17    57.67    58.00    82.67
         Ave                   0.0099   0.0089   0.0087   0.0088   0.0082
         Std                   0.0032   0.0022   0.0022   0.0021   0.0017
         Relative Loss (%)     20.48    7.53     5.38     6.17
RADE     Score (%)             33.00    49.17    53.33    58.83    81.33
         Ave                   0.0477   0.0455   0.0452   0.0451   0.0438
         Std                   0.0059   0.0051   0.0051   0.0049   0.0042
         Relative Loss (%)     8.77     3.70     3.11     2.91
ER       Ave                   0.0457   0.0547   0.0546   0.0516   0.0533
         Std                   0.0156   0.0126   0.0143   0.0127   0.0146

Example 3. We now consider the geometric Brownian motion (GBM): 

$$dr_t = \mu r_t\,dt + \sigma r_t\,dW_t,$$

where $W_t$ is a standard one-dimensional Brownian motion. This is a non-stationary process, on 
which we check whether our method continues to apply. Note that the celebrated Black-Scholes option 
price formula is derived under Osborne's assumption that the stock price follows the GBM model. 
By the Itô formula, we have 

$$\log r_t - \log r_0 = (\mu - \sigma^2/2)\,t + \sigma W_t.$$

We set $\mu = 0.03$ and $\sigma = 0.26$ in our simulations. With the Brownian motion simulated from 
independent Gaussian increments, one can generate the samples for the GBM. We do so 
with $\Delta = 1/52$ in 600 simulations. For each simulation, we generate 1000 observations 
and use the first two thirds of the observations as in-sample data and the remaining observations as 
out-sample data. 
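
The simulation design of this example can be sketched in Python as follows. The sketch uses the exact log-increment form of the GBM driven by independent Gaussian increments; the function name, the starting value `r0 = 1.0` and the seed are assumptions introduced for illustration only.

```python
import numpy as np

def simulate_gbm(n_obs=1000, delta=1/52, mu=0.03, sigma=0.26, r0=1.0, seed=0):
    """Sketch: simulate a GBM path on a grid of width delta.

    Hypothetical helper; r0 and seed are illustrative assumptions.
    Returns n_obs + 1 values, including the starting point r0.
    """
    rng = np.random.default_rng(seed)
    # Brownian increments W_{t+delta} - W_t ~ N(0, delta)
    dw = np.sqrt(delta) * rng.standard_normal(n_obs)
    # exact increments of log r_t implied by the Ito formula
    log_inc = (mu - 0.5 * sigma**2) * delta + sigma * dw
    log_path = np.log(r0) + np.concatenate(([0.0], np.cumsum(log_inc)))
    return np.exp(log_path)

if __name__ == "__main__":
    path = simulate_gbm()
    n_in = (2 * (len(path) - 1)) // 3          # first two thirds in-sample
    in_sample, out_sample = path[: n_in + 1], path[n_in + 1:]
    print(len(in_sample), len(out_sample))
```

Because the log-increments are simulated exactly, no discretization error is introduced by the weekly step.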



Table 4: Comparisons of several volatility estimation methods 

Measure  Empirical Formula     Hist     RiskM    Semi     NonBay   Integ
IMADE    Score (%)             2.17     89.98    7.01     99.17    99.17
         Ave (x10^-5)          0.1615   0.0811   0.1154   0.0746   0.0746
         Std (x10^-4)          0.1030   0.0473   0.0632   0.0440   0.0440
         Relative Loss (%)     116.42   8.64     54.63
MADE     Score (%)             40.17    58.67    54.00    60.00    66.17
         Ave (x10^-5)          0.2424   0.2984   0.2896   0.2958   0.2859
         Std (x10^-4)          0.1037   0.1739   0.1633   0.1723   0.1663
         Relative Loss (%)     -15.24   4.35     1.30     3.46
RADE     Score (%)             36.83    60.17    47.50    62.33    69.50
         Ave (x10^-3)          0.5236   0.4997   0.5114   0.4975   0.4903
         Std (x10^-3)          0.5898   0.6608   0.6567   0.6573   0.6435
         Relative Loss (%)     6.80     1.92     4.30     1.47
ER       Ave                   0.0693   0.0532   0.0517   0.0506   0.0444
         Std                   0.0467   0.0095   0.0219   0.0110   0.0160


Table 5: Robust comparisons of several volatility estimation methods 

Measure  Empirical Formula     Hist     RiskM    Semi     NonBay   Integ
IMADE    Ave (x10^-6)          0.5579   0.3025   0.4374   0.2748   0.2748
         Relative Loss (%)     103.01   10.08    59.17
MADE     Ave (x10^-5)          0.1115   0.1107   0.1111   0.1097   0.1061
         Relative Loss (%)     5.07     4.30     4.67     3.42
RADE     Ave (x10^-3)          0.4268   0.3901   0.4028   0.3885   0.3836
         Relative Loss (%)     11.27    1.71     5.00     1.28
ER       Ave                   0.0628   0.0521   0.0493   0.0494   0.0428


Table 4 summarizes the results. The historical simulation approach has the smallest MADE, 
but forecasts poorly in terms of IMADE. This difference between IMADE and MADE is 
surprising; it may be produced by the non-stationarity of the process. 
For the integrated method, even though the true volatility structure is well captured, as shown by the 
lowest IMADE, extreme values of the observations make the MADE quite large. To calibrate the 
performance of the volatility estimators more accurately, we use the 95% up-trimmed mean instead 
of the mean to summarize the values of the measures. Table 5 reports the trimmed means and the 
relative losses for the different measures. Conclusions similar to those in Example 1 can be drawn 
from the table. This shows that our integrated method continues to perform better than the others 
in this non-stationary case. The Bayesian estimator performs comparably with the dynamically 
integrated method and outperforms all others. 
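
To make the robust summary concrete, the Python sketch below computes an up-trimmed mean and a relative loss. Two assumptions are made here and are not spelled out in this section: the 95% up-trimmed mean is taken to be the average of the smallest 95% of the values (the largest 5% are discarded), and the relative loss of a method is taken to be its average measure divided by that of the integrated estimator, minus one, in percent. The second assumption is at least consistent with the tabulated averages up to rounding. Both function names are hypothetical.

```python
import numpy as np

def up_trimmed_mean(values, keep=0.95):
    """Average of the smallest `keep` fraction of the values (assumed definition)."""
    x = np.sort(np.asarray(values, dtype=float))
    # small tolerance guards against floating-point error in keep * n
    k = max(1, int(np.floor(keep * x.size + 1e-9)))
    return x[:k].mean()

def relative_loss(ave_method, ave_integrated):
    """Relative loss in percent with respect to the integrated estimator (assumed definition)."""
    return 100.0 * (ave_method / ave_integrated - 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # illustrative heavy-tailed errors, not data from the paper
    errors = np.abs(rng.standard_t(df=3, size=600))
    print(errors.mean(), up_trimmed_mean(errors))
    # rounded IMADE averages of Hist and Integ taken from the robust comparison table
    print(relative_loss(0.5579, 0.2748))
```

With heavy-tailed errors the trimmed mean is noticeably smaller than the plain mean, which is the effect the robust comparison is designed to remove.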

5.2 Empirical Study 

In this section, we will apply the integrated volatility estimation methods and others to the 
analysis of real financial data. 

5.2.1 Treasury Bond 

We consider here the weekly returns of three treasury bonds with terms 1, 5 and 10 years, 
respectively. 

We set the observations from January 4, 1974 to December 30, 1994 as in-sample data, and 
those from January 6, 1995 to August 8, 2003 as out-sample data. The total sample size is 1545 
and the in-sample size is 1096. The results are reported in Table 6. 

From Table 6, the integrated estimator has the smallest MADE and almost the smallest RADE, 
which indicates that the integrated method is the best of the five volatility estimation 
methods. Relative losses in MADE of the other estimators with respect to the integrated estimator 
can easily be computed as ranging from 8.47% (NonBay) to 42.6% (Hist) for the bond with a one-year 
term. For the bonds with 5- or 10-year terms, the five estimators have close MADEs and 
RADEs, where the historical simulation method is better than RiskMetrics in terms of MADE 
and RADE, and the integrated estimation approach has the smallest MADEs. This demonstrates 
the advantage of using state-domain information, which can help the time-domain prediction of the 
changes in bond interest dynamics. 

Table 6: Comparisons of several volatility estimation methods 

Term      Measure   Hist      RiskM     Semi      NonBay    Integ
1 year    MADE      0.01044   0.00787   0.00787   0.00794   0.00732
          RADE      0.05257   0.04231   0.04256   0.04225   0.04107
          ER        0.022     0.020     0.022     0.016     0.038
5 years   MADE      0.01207   0.01253   0.01296   0.01278   0.01201
          RADE      0.05315   0.05494   0.05630   0.05562   0.05572
          ER        0.007     0.014     0.016     0.011     0.058
10 years  MADE      0.01041   0.01093   0.01103   0.01112   0.01018
          RADE      0.04939   0.05235   0.05296   0.05280   0.05151
          ER        0.011     0.016     0.018     0.013     0.049

5.2.2 Exchange Rate 

We analyse the daily exchange rates of several foreign currencies against the US dollar. The data are 
from January 3, 1994 to August 1, 2003. The in-sample data consist of the observations before 
January 1, 2001, and the out-sample data consist of the remaining observations. The results are 
reported in Table 7. The integrated estimator has the smallest MADEs for all the 
exchange rates, which again supports our integrated volatility estimation. 



Table 7: Comparisons of several volatility estimation methods 

Currency   Measure          Hist     RiskM    Semi     NonBay   Integ
U.K.       MADE (x10^-4)    0.614    0.519    0.536    0.519    0.492
           RADE (x10^-3)    3.991    3.424    3.513    3.438    3.491
           ER               0.011    0.017    0.019    0.015    0.039
Australia  MADE (x10^-4)    0.172    0.132    0.135    0.135    0.126
           RADE (x10^-3)    1.986    1.775    1.830    1.797    1.762
           ER               0.054    0.025    0.026    0.022    0.043
Japan      MADE (x10^-1)    5.554    5.232    5.444    5.439    5.067
           RADE (x10^-1)    3.596    3.546    3.622    3.588    3.560
           ER               0.014    0.011    0.019    0.012    0.029



6 Conclusions 

We have proposed a Bayesian method and a dynamically integrated method to aggregate the 
information from the time domain and the state domain. The performance comparisons are studied 
both empirically and theoretically. We have shown that the proposed integrated method 
effectively aggregates the information from both the time and the state domains, and has advantages 
over some previous methods. It is powerful in forecasting volatilities for the yields of bonds and 
for exchange rates. Our study has also revealed that proper use of information from both the time 
domain and the state domain makes volatility forecasting more accurate. Our method exploits the 
continuity in the time domain and stationarity in the state domain. It can be applied to situations 
where these two conditions hold approximately. 

7 Appendix 

We collect technical conditions for the proof of our results. 

(A1) $\sigma^2(x)$ is Lipschitz continuous. 

(A2) There exists a constant $L > 0$ such that $E|\mu(r_s)|^{2(p+\delta)} < L$ and $E|\sigma(r_s)|^{2(p+\delta)} < L$ for any 
$s \in [t - \eta, t]$, where $\eta$ is some positive constant, $p$ is an integer not less than 4 and $\delta > 0$. 

(A3) The discrete observations $\{r_{t_i}\}_{i=0}^{N}$ satisfy the stationarity conditions of Banon (1978). 
Furthermore, the $G_2$ condition of Rosenblatt (1970) holds for the transition operator. 

(A4) The conditional density $p_\ell(y|x)$ of $r_{t_{i+\ell}}$ given $r_{t_i}$ is continuous in the arguments $(y, x)$ and is 
bounded by a constant independent of $\ell$. 

(A5) The kernel $W$ is a bounded, symmetric probability density function with compact support, 
$[-1, 1]$ say. 

(A6) $(N - n)h \to \infty$, $(N - n)h^5 \to 0$, $(N - n)h\Delta \to 0$. 

Throughout the proof, we denote by $M$ a generic positive constant, and use $\mu_s$ and $\sigma_s$ to 
represent $\mu(r_s)$ and $\sigma(r_s)$, respectively. 

Proof of Proposition 1. It suffices to show that the process $\{r_s\}$ is Hölder-continuous with order 
$q = (p - 1)/(2p)$ and coefficient $K_1$, where $E[K_1^{2(p+\delta)}] < \infty$, because this together with assumption 
(A1) gives the result. By Jensen's inequality and martingale moment inequalities 
(Karatzas & Shreve 1991, Section 3.3.D), we have 

$$
\begin{aligned}
E|r_u - r_s|^{2(p+\delta)} &\le M\left( E\left|\int_s^u \mu_v\,dv\right|^{2(p+\delta)} + E\left|\int_s^u \sigma_v\,dW_v\right|^{2(p+\delta)} \right) \\
&\le M(u - s)^{2(p+\delta)-1}\int_s^u E|\mu_v|^{2(p+\delta)}\,dv + M(u - s)^{p+\delta-1}\int_s^u E|\sigma_v|^{2(p+\delta)}\,dv \\
&\le M(u - s)^{p+\delta}.
\end{aligned}
$$

Then by the Kolmogorov continuity theorem (Revuz & Yor 1991, Theorem 2.1), $\{r_s\}$ is 
Hölder-continuous. 

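Proof of Theorem 1. Let $Z_{i,s} = (r_s - r_{t_i})^2$. Applying the Itô formula to $Z_{i,s}$, we obtain 

$$
\begin{aligned}
dZ_{i,s} &= 2\left(\int_{t_i}^{s}\mu_u\,du + \int_{t_i}^{s}\sigma_u\,dW_u\right)(\mu_s\,ds + \sigma_s\,dW_s) + \sigma_s^2\,ds \\
&= 2\left[\left(\int_{t_i}^{s}\mu_u\,du + \int_{t_i}^{s}\sigma_u\,dW_u\right)\mu_s\,ds + \sigma_s\left(\int_{t_i}^{s}\mu_u\,du\right)dW_s\right] \\
&\quad + 2\left(\int_{t_i}^{s}\sigma_u\,dW_u\right)\sigma_s\,dW_s + \sigma_s^2\,ds.
\end{aligned}
$$

Then $Y_i^2$ can be decomposed as 

$$Y_i^2 = 2a_i + 2b_i + \bar\sigma_i^2,$$

where 

$$a_i = \Delta^{-1}\left[\int_{t_i}^{t_{i+1}}\mu_s\,ds\int_{t_i}^{s}\mu_u\,du + \int_{t_i}^{t_{i+1}}\mu_s\,ds\int_{t_i}^{s}\sigma_u\,dW_u + \int_{t_i}^{t_{i+1}}\sigma_s\,dW_s\int_{t_i}^{s}\mu_u\,du\right],$$

$$b_i = \Delta^{-1}\int_{t_i}^{t_{i+1}}\int_{t_i}^{s}\sigma_u\,dW_u\,\sigma_s\,dW_s,$$

and 

$$\bar\sigma_i^2 = \Delta^{-1}\int_{t_i}^{t_{i+1}}\sigma_s^2\,ds.$$

Therefore, $\hat\sigma^2_{ES,t}$ can be written as 

$$
\begin{aligned}
\hat\sigma^2_{ES,t} &= 2\,\frac{1-\lambda}{1-\lambda^n}\sum_{i=t-n}^{t-1}\lambda^{t-i-1}a_i + 2\,\frac{1-\lambda}{1-\lambda^n}\sum_{i=t-n}^{t-1}\lambda^{t-i-1}b_i + \frac{1-\lambda}{1-\lambda^n}\sum_{i=t-n}^{t-1}\lambda^{t-i-1}\bar\sigma_i^2 \\
&\equiv A_{n,\Delta} + B_{n,\Delta} + C_{n,\Delta}.
\end{aligned}
$$

By Proposition 1, as $n\Delta \to 0$, 

$$|C_{n,\Delta} - \sigma_t^2| \le K(n\Delta)^q,$$

where $q = (p - 1)/(2p)$. This combined with Lemmas 1 and 2 below completes the proof of the theorem. 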
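
The decomposition above concerns an exponentially weighted average of the squared normalized returns $Y_i^2$, with weights proportional to $\lambda^{t-i-1}$ and normalizing factor $(1-\lambda)/(1-\lambda^n)$. A minimal Python sketch of an estimator of this form is given below; the function name is hypothetical, the input is assumed to hold $Y_{t-n}^2, \ldots, Y_{t-1}^2$ in time order, and $0 < \lambda < 1$ is assumed.

```python
import numpy as np

def exp_smoothing_volatility(y_sq, lam):
    """Sketch: (1 - lam) / (1 - lam**n) * sum_i lam**(t-i-1) * Y_i**2.

    y_sq holds Y_{t-n}^2, ..., Y_{t-1}^2 in time order (assumed layout);
    lam is assumed to lie strictly between 0 and 1.
    """
    y_sq = np.asarray(y_sq, dtype=float)
    n = y_sq.size
    # the most recent observation (i = t-1) receives the power lam**0
    powers = lam ** np.arange(n - 1, -1, -1)
    weights = (1.0 - lam) / (1.0 - lam**n) * powers
    return float(weights @ y_sq)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = 0.2 * rng.standard_normal(250)         # illustrative returns, true variance 0.04
    print(exp_smoothing_volatility(y**2, lam=0.94))
```

The weights sum to one, so a constant squared-return series is reproduced exactly, and the normalization by $1-\lambda^n$ accounts for the finite window of length $n$.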
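Lemma 1 If condition (A2) is satisfied, then $E[A_{n,\Delta}^2] = O(\Delta)$. 

Proof. Simple algebra gives the result. In fact, 

$$
\begin{aligned}
E(a_i^2) &\le 3E\left[\Delta^{-1}\int_{t_i}^{t_{i+1}}\mu_s\,ds\int_{t_i}^{s}\mu_u\,du\right]^2 + 3E\left[\Delta^{-1}\int_{t_i}^{t_{i+1}}\mu_s\,ds\int_{t_i}^{s}\sigma_u\,dW_u\right]^2 \\
&\quad + 3E\left[\Delta^{-1}\int_{t_i}^{t_{i+1}}\sigma_s\,dW_s\int_{t_i}^{s}\mu_u\,du\right]^2 \\
&\equiv I_1(\Delta) + I_2(\Delta) + I_3(\Delta).
\end{aligned}
$$

Applying Jensen's inequality, we obtain that 

$$
I_1(\Delta) = O(\Delta^{-1})\,E\int_{t_i}^{t_{i+1}}\int_{t_i}^{s}\mu_s^2\mu_u^2\,du\,ds
= O(\Delta^{-1})\int_{t_i}^{t_{i+1}}\int_{t_i}^{s}E(\mu_u^4 + \mu_s^4)\,du\,ds = O(\Delta).
$$

By Jensen's inequality, Hölder's inequality and martingale moment inequalities, we have 

$$
\begin{aligned}
I_2(\Delta) &= O(\Delta^{-1})\int_{t_i}^{t_{i+1}}E\left(\mu_s\int_{t_i}^{s}\sigma_u\,dW_u\right)^2 ds \\
&= O(\Delta^{-1})\int_{t_i}^{t_{i+1}}\left\{E[\mu_s^4]\,E\left[\int_{t_i}^{s}\sigma_u\,dW_u\right]^4\right\}^{1/2} ds = O(\Delta).
\end{aligned}
$$

Similarly, $I_3(\Delta) = O(\Delta)$. Therefore, $E(a_i^2) = O(\Delta)$. Then by the Cauchy-Schwarz inequality and 
noting that $n(1 - \lambda) = O(1)$, we obtain that 

$$E[A_{n,\Delta}^2] \le 4n\left(\frac{1-\lambda}{1-\lambda^n}\right)^2\sum_{i=t-n}^{t-1}\lambda^{2(t-i-1)}E(a_i^2) = O(\Delta).$$
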

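Lemma 2 Under condition (A2), if $n \to \infty$ and $n\Delta \to 0$, then 

$$s_{1,t}^{-1}\sqrt{n}\,B_{n,\Delta} \xrightarrow{D} N(0, 1).$$

Proof. Note that 

$$b_j = \sigma_t^2\,\Delta^{-1}\int_{t_j}^{t_{j+1}}(W_s - W_{t_j})\,dW_s + \epsilon_j, \tag{A1}$$

where 

$$\epsilon_j = \Delta^{-1}\int_{t_j}^{t_{j+1}}(\sigma_s - \sigma_t)\left[\int_{t_j}^{s}\sigma_u\,dW_u\right]dW_s + \sigma_t\,\Delta^{-1}\int_{t_j}^{t_{j+1}}\left[\int_{t_j}^{s}(\sigma_u - \sigma_t)\,dW_u\right]dW_s.$$

By the central limit theorem for martingales (see Hall & Heyde 1980, Corollary 3.1), it suffices to 
show that 

$$V_n^2 \equiv E[s_{1,t}^{-2}\,n\,B_{n,\Delta}^2] \to 1, \tag{A2}$$

and that the following Lyapunov condition holds: 

$$\sum_{i=t-n}^{t-1} E\left(s_{1,t}^{-1}\sqrt{n}\;2\,\frac{1-\lambda}{1-\lambda^n}\,\lambda^{t-i-1}\,b_i\right)^4 \to 0. \tag{A3}$$

Note that 

$$\frac{\Delta^2}{2}E(\epsilon_j^2) \le E\left\{\int_{t_j}^{t_{j+1}}(\sigma_s - \sigma_t)\left[\int_{t_j}^{s}\sigma_u\,dW_u\right]dW_s\right\}^2 + \sigma_t^2\,E\left\{\int_{t_j}^{t_{j+1}}\left[\int_{t_j}^{s}(\sigma_u - \sigma_t)\,dW_u\right]dW_s\right\}^2. \tag{A4}$$

By Jensen's inequality, Hölder's inequality and moment inequalities for martingales, we have 

$$
\begin{aligned}
E\left\{\int_{t_j}^{t_{j+1}}(\sigma_s - \sigma_t)\left[\int_{t_j}^{s}\sigma_u\,dW_u\right]dW_s\right\}^2
&\le \int_{t_j}^{t_{j+1}}E\left\{(\sigma_s - \sigma_t)^2\left[\int_{t_j}^{s}\sigma_u\,dW_u\right]^2\right\}ds \\
&\le \int_{t_j}^{t_{j+1}}\left\{E(\sigma_s - \sigma_t)^4\,E\left[\int_{t_j}^{s}\sigma_u\,dW_u\right]^4\right\}^{1/2}ds \\
&\le \int_{t_j}^{t_{j+1}}\left\{E[K(n\Delta)^q]^4\;36\Delta\int_{t_j}^{s}E\sigma_u^4\,du\right\}^{1/2}ds \\
&\le M(n\Delta)^{2q}\Delta^2.
\end{aligned}
\tag{A5}
$$

Similarly, 

$$E\left\{\int_{t_j}^{t_{j+1}}\left[\int_{t_j}^{s}(\sigma_u - \sigma_t)\,dW_u\right]dW_s\right\}^2 \le M(n\Delta)^{2q}\Delta^2. \tag{A6}$$

By (A4), (A5) and (A6), 

$$E(\epsilon_j^2) \le M(n\Delta)^{2q}. \tag{A7}$$

Therefore, 

$$E[\sigma_t^{-4}b_j^2] = \frac{1}{2} + O((n\Delta)^q).$$

By the theory of stochastic calculus, simple algebra gives that $E(b_j) = 0$ and $E(b_i b_j) = 0$ for $i \ne j$. 
It follows that 

$$V_n^2 = E\left[s_{1,t}^{-2}\,n\sum_{i=t-n}^{t-1}\left(2\,\frac{1-\lambda}{1-\lambda^n}\,\lambda^{t-i-1}\right)^2 b_i^2\right] \to 1.$$

That is, (A2) holds. For (A3), it suffices to prove that $E(b_j^4)$ is bounded, which holds by applying 
the moment inequalities for martingales to $b_j$. 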
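
As a sanity check on the leading constant $1/2$ in the second moment of $b_j$, consider the idealized case of a constant volatility $\sigma$, in which $b_j = \sigma^2\Delta^{-1}\int_{t_j}^{t_{j+1}}(W_s - W_{t_j})\,dW_s$ exactly. The short derivation below is a worked illustration under that simplifying assumption, not part of the original proof; $Z$ denotes a standard normal variable introduced for the calculation.

```latex
% Ito's formula applied to (W_s - W_{t_j})^2 on [t_j, t_{j+1}], with \Delta = t_{j+1} - t_j:
\int_{t_j}^{t_{j+1}} (W_s - W_{t_j})\, dW_s
   = \tfrac{1}{2}\left[ (W_{t_{j+1}} - W_{t_j})^2 - \Delta \right].
% Writing W_{t_{j+1}} - W_{t_j} = \sqrt{\Delta}\, Z with Z \sim N(0,1):
\Delta^{-1} \int_{t_j}^{t_{j+1}} (W_s - W_{t_j})\, dW_s = \tfrac{1}{2}\,(Z^2 - 1).
% Since E[Z^2] = 1 and Var(Z^2) = E[Z^4] - 1 = 2:
E\left[ \sigma^{-4} b_j^2 \right]
   = \tfrac{1}{4}\, E\left[ (Z^2 - 1)^2 \right]
   = \tfrac{1}{4}\, \mathrm{Var}(Z^2)
   = \tfrac{1}{2}.
```

The same identity also gives $E(b_j) = 0$ in this case, since $E[Z^2 - 1] = 0$.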

Proof of Theorem 2. The proof follows the same lines as in Fan & Zhang (2003). 
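Proof of Theorem 3. By Fan & Yao (1998), the volatility estimator $\hat\sigma^2_{S,t}$ behaves as if the 
instantaneous return function $f$ is known, hence without loss of generality we assume that $f(x) = 0$, 
so that the responses in the local linear fit are $Y_i^2$. Let $\mathbf{Y} = (Y_0^2, \cdots, Y_{N-n-1}^2)^T$, 
$\mathbf{W} = \mathrm{diag}\{W_h(r_{t_0} - r_{t_N}), \cdots, W_h(r_{t_{N-n-1}} - r_{t_N})\}$, and 

$$\mathbf{X} = \begin{pmatrix} 1 & r_{t_0} - r_{t_N} \\ \vdots & \vdots \\ 1 & r_{t_{N-n-1}} - r_{t_N} \end{pmatrix}.$$

Denote by $m_i = E[Y_i^2 | r_{t_i}]$, $\mathbf{m} = (m_0, \cdots, m_{N-n-1})^T$ and $\mathbf{e}_1 = (1, 0)^T$. Define $\mathbf{S}_N = \mathbf{X}^T\mathbf{W}\mathbf{X}$ and 
$\mathbf{T}_N = \mathbf{X}^T\mathbf{W}\mathbf{Y}$. Then it can be written that (see Fan & Yao, 2003) 

$$\hat\sigma^2_{S,t_N} = \mathbf{e}_1^T\mathbf{S}_N^{-1}\mathbf{T}_N.$$

Hence 

$$\hat\sigma^2_{S,t_N} - \sigma^2_{t_N} = \mathbf{e}_1^T\mathbf{S}_N^{-1}\mathbf{X}^T\mathbf{W}\{\mathbf{m} - \mathbf{X}\boldsymbol\beta_N\} + \mathbf{e}_1^T\mathbf{S}_N^{-1}\mathbf{X}^T\mathbf{W}(\mathbf{Y} - \mathbf{m}) \equiv \mathbf{e}_1^T\mathbf{b} + \mathbf{e}_1^T\mathbf{t}, \tag{A8}$$

where $\boldsymbol\beta_N = (m(r_{t_N}), m'(r_{t_N}))^T$ with $m(r_{t_N}) = E[Y_1^2 | r_{t_1} = r_{t_N}]$. By Fan & Zhang (2003), the 
bias vector $\mathbf{b}$ satisfies $\mathbf{b} = O_p(h^2) = o_p(1/\sqrt{(N - n)h})$. In the 
following, we will show that the centralized vector $\mathbf{t}$ is asymptotically normal. 

In fact, put $\mathbf{u} = (N - n)^{-1}\mathbf{H}^{-1}\mathbf{X}^T\mathbf{W}(\mathbf{Y} - \mathbf{m})$ where $\mathbf{H} = \mathrm{diag}\{1, h\}$; then by Fan & Zhang 
(2003) the vector $\mathbf{t}$ can be written as 

$$\mathbf{t} = p^{-1}(r_{t_N})\,\mathbf{H}^{-1}\mathbf{S}^{-1}\mathbf{u}\,(1 + o_p(1)), \tag{A9}$$

where $\mathbf{S} = (\mu_{i+j-2})_{i,j=1,2}$ with $\mu_j = \int u^j W(u)\,du$. For any constant vector $\mathbf{c} = (c_0, c_1)^T$, define 

$$Q_N = \mathbf{c}^T\mathbf{u} = \frac{1}{N-n}\sum_{i=0}^{N-n-1}\{Y_i^2 - m_i\}\,C_h(r_{t_i} - r_{t_N}),$$

where $C_h(\cdot) = h^{-1}C(\cdot/h)$ with $C(x) = c_0W(x) + c_1xW(x)$. Applying the "big-block" and 
"small-block" arguments in Fan & Yao (2003, Theorem 6.3), we obtain 

$$\theta^{-1}(r_{t_N})\sqrt{(N-n)h}\;Q_N \xrightarrow{D} N(0, 1), \tag{A10}$$

where $\theta^2(r_{t_N}) = 2p(r_{t_N})\sigma^4(r_{t_N})\int_{-\infty}^{\infty}C^2(u)\,du$. In the following, we will decompose $Q_N$ into two 
parts, $Q'_N$ and $Q''_N$, which satisfy: 

(i) $(N - n)h\,E[\theta^{-1}(r_{t_N})Q'_N]^2 \to 0$. 

(ii) $Q''_N$ is identically distributed as $Q_N$ and is asymptotically independent of $\hat\sigma^2_{ES,t_N}$. 

Define 

$$Q'_N = \frac{1}{N-n}\sum_{i=0}^{a_N}\{Y_i^2 - m_i\}\,C_h(r_{t_i} - r_{t_N}), \tag{A11}$$

and 

$$Q''_N = Q_N - Q'_N,$$

where $a_N$ is a positive integer satisfying $a_N = o(N - n)$ and $a_N\Delta \to \infty$. Let 
$\vartheta_{N,\ell} = (Y_\ell^2 - m_\ell)\,C_h(r_{t_\ell} - r_{t_N})$; then by Fan & Zhang (2003) 

$$\mathrm{Var}[\theta^{-1}(r_{t_N})\vartheta_{N,1}] = h^{-1}(1 + o(1)) \quad\text{and}\quad \sum_{\ell=1}^{N-n-2}|\mathrm{Cov}(\vartheta_{N,1}, \vartheta_{N,\ell+1})| = o(h^{-1}), \tag{A12}$$

which yields the result in (i). This combined with (A10), (i) and (A11) leads to 

$$\theta^{-1}(r_{t_N})\sqrt{(N-n)h}\;Q''_N \xrightarrow{D} N(0, 1). \tag{A13}$$

Note that the stationarity conditions of Banon (1978) and the $G_2$ condition of Rosenblatt (1970) 
on the transition operator imply that the $\rho$-mixing coefficient $\rho(\ell)$ of $\{r_{t_i}\}$ decays exponentially. 
Since the strong-mixing coefficient satisfies $\alpha(\ell) \le \rho(\ell)$, it follows that 

$$|E\exp\{i\xi(Q''_N + \hat\sigma^2_{ES,t_N})\} - E\exp\{i\xi Q''_N\}\,E\exp\{i\xi\hat\sigma^2_{ES,t_N}\}| \le 32\,\alpha(s_N) \to 0, \tag{A14}$$

for any $\xi \in R$. Using the theorem of Volkonskii & Rozanov (1959), one gets the asymptotic 
independence of $\hat\sigma^2_{ES,t_N}$ and $Q''_N$. 

By (i), $\sqrt{(N - n)h}\,Q'_N$ is asymptotically negligible. This together with Theorem 1 leads to 

$$d_1\theta^{-1}(r_{t_N})\sqrt{(N-n)h}\;Q_N + d_2V_2^{-1/2}\sqrt{n}\,[\hat\sigma^2_{ES,t_N} - \sigma^2(r_{t_N})] \xrightarrow{D} N(0, d_1^2 + d_2^2),$$

for any $d_1, d_2 \in R$, where $V_2$, proportional to $\sigma^4(r_{t_N})$, is the asymptotic variance of the 
time-domain estimator from Theorem 1. Since $Q_N$ is a linear transform of $\mathbf{u}$, 

$$\mathbf{V}^{-1/2}\begin{pmatrix}\sqrt{(N-n)h}\;\mathbf{u} \\ \sqrt{n}\,[\hat\sigma^2_{ES,t_N} - \sigma^2(r_{t_N})]\end{pmatrix} \xrightarrow{D} N(0, \mathbf{I}_3),$$

where $\mathbf{V} = \mathrm{blockdiag}\{\mathbf{V}_1, V_2\}$ with $\mathbf{V}_1 = 2\sigma^4(r_{t_N})p(r_{t_N})\mathbf{S}^*$, where $\mathbf{S}^* = (\nu_{i+j-2})_{i,j=1,2}$ with 
$\nu_j = \int u^jW^2(u)\,du$. This combined with (A9) gives the joint asymptotic normality of $\mathbf{t}$ and $\hat\sigma^2_{ES,t_N}$. 
Since $\mathbf{b} = o_p(1/\sqrt{(N - n)h})$, it follows that 

$$\boldsymbol\Sigma^{-1/2}\begin{pmatrix}\sqrt{(N-n)h}\,[\hat\sigma^2_{S,t_N} - \sigma^2(r_{t_N})] \\ \sqrt{n}\,[\hat\sigma^2_{ES,t_N} - \sigma^2(r_{t_N})]\end{pmatrix} \xrightarrow{D} N(0, \mathbf{I}_2),$$

where $\boldsymbol\Sigma = \mathrm{diag}\{2\sigma^4(r_{t_N})\nu_0/p(r_{t_N}), V_2\}$. Since $\hat\sigma^2_{S,t_N}$ and $\hat\sigma^2_{ES,t_N}$ are asymptotically 
independent, the asymptotic normality of $\hat\sigma^2_{I,t_N}$ follows. 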
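
The state-domain estimator used in this proof is the intercept of a kernel-weighted linear regression of the squared returns on the centered state variable, that is, the first component of the weighted least-squares solution built from the design matrix, the kernel weight matrix and the response vector. A minimal Python sketch follows. The Epanechnikov kernel is an assumption chosen because it is a bounded, symmetric density supported on $[-1, 1]$, as condition (A5) requires; the function names and the toy data are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Bounded symmetric density on [-1, 1]; an assumed choice of the kernel W."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def local_linear_volatility(r, y_sq, x0, h):
    """Sketch: intercept of the kernel-weighted linear fit of y_sq on (r - x0)."""
    r = np.asarray(r, dtype=float)
    y_sq = np.asarray(y_sq, dtype=float)
    d = r - x0
    w = epanechnikov(d / h) / h                   # kernel weights W_h(r_i - x0)
    X = np.column_stack((np.ones_like(d), d))     # design matrix with rows (1, r_i - x0)
    S = X.T @ (w[:, None] * X)                    # X^T W X
    T = X.T @ (w * y_sq)                          # X^T W Y
    beta = np.linalg.solve(S, T)
    return float(beta[0])                         # e_1^T S^{-1} T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r = rng.uniform(0.02, 0.10, size=2000)        # toy state variable, not real data
    y = np.sqrt(0.5 * r) * rng.standard_normal(2000)
    print(local_linear_volatility(r, y**2, x0=0.06, h=0.01))
```

Solving the two-by-two normal equations directly avoids forming an explicit inverse; when the responses are exactly linear in the state variable, the fit reproduces them without bias.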